machine-readable text

A scanner converts printed documents into machine-readable text.

Definition

Noun: Electronic text stored as character strings. The text is encoded in a format that a computer can process directly. It is stored as sequences of characters (letters, digits, symbols) and can be displayed, searched, or manipulated by software.
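As a small illustration of "searched, or manipulated", the sketch below applies ordinary string operations to the entry's own example sentence. The variable names are chosen here for illustration only; none of this would work on a scanned image of the same sentence.

```python
# The example sentence from this entry, stored as a string of characters.
text = "A scanner converts printed documents into machine-readable text."

# Search: find where a phrase starts (find returns -1 if it is absent).
position = text.find("machine-readable")

# Manipulate: split into words and change the case.
word_count = len(text.split())
upper_case = text.upper()

print(position, word_count)
print(upper_case)
```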

Usage

This term describes digital text data that is not merely an image or scan of text but is composed of actual character codes that software can interpret.
  • The library is converting its archives into machine-readable text for online databases.
  • Optical Character Recognition (OCR) software turns scanned documents into machine-readable text.
  • A plain .txt file is a common form of machine-readable text.

Advanced Usage
  • Distinction from "human-readable": Text shown on a page or screen is generally human-readable, whereas "machine-readable" specifically emphasizes that a computer can parse the text's format. A PDF may be human-readable on screen, but if its pages are stored only as images with no accessible text layer, it is not machine-readable for data extraction.
  • In Data Standards: The term often comes up in the contexts of open data, accessibility, and data interchange, where information must be provided in a structured, machine-parsable format (e.g., XML, JSON, CSV) rather than as an image or unstructured document.
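To make "machine-parsable" concrete, here is a minimal sketch using Python's standard `csv` and `json` modules. The sample records are invented for illustration and do not come from any real dataset.

```python
import csv
import io
import json

# Hypothetical sample data, invented for this illustration.
csv_source = "title,year\nArchive Catalogue,1998\nAnnual Report,2004\n"
json_source = '{"title": "Annual Report", "year": 2004}'

# csv.DictReader maps each data row to a dict keyed by the header line.
# CSV carries no type information, so every value arrives as a string.
rows = list(csv.DictReader(io.StringIO(csv_source)))

# json.loads turns JSON text into native Python objects (here, a dict),
# preserving the distinction between strings and numbers.
record = json.loads(json_source)

print(rows[1]["title"], rows[1]["year"])
print(record["title"], record["year"])
```

A program can reach each field by name in both cases, which is not possible when the same table exists only as a picture of a page.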
Variants and Related Words
  • Plain text (n): A basic type of machine-readable text containing only characters, with no formatting data like fonts or colors.
  • Structured text (n): Machine-readable text organized with tags or markers (e.g., HTML, XML) to define its structure and meaning.
  • Digital text (n): A broader term for any textual content in digital form, which may or may not be easily machine-readable.
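The relationship between structured text and plain text can be sketched with the standard `html.parser` module: the tags carry structure, and removing them leaves only the characters. The class and function names below are hypothetical helpers written for this example.

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collects character data and ignores the tags around it."""

    def __init__(self):
        super().__init__()
        self.parts = []

    def handle_data(self, data):
        # Called for each run of text between tags.
        self.parts.append(data)


def to_plain_text(markup: str) -> str:
    parser = TextExtractor()
    parser.feed(markup)
    parser.close()  # flush any text still buffered by the parser
    return "".join(parser.parts)


structured = "<p>A plain <b>.txt</b> file is machine-readable.</p>"
plain = to_plain_text(structured)
print(plain)
```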
Synonyms
  • Encoded text
  • Parsable text
  • Digital text data
Related Concepts
  • OCR (Optical Character Recognition): The technology that creates machine-readable text from images of text.
  • Character encoding: The system (e.g., ASCII, Unicode) that defines how characters are represented as numbers for machines to read.
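As a brief illustration of character encoding, the sketch below shows how Python exposes the Unicode code points of a short string and its UTF-8 byte representation. The two-character sample string is arbitrary.

```python
# "A" followed by "é" (written as an escape so the source file stays ASCII).
text = "A\u00e9"

# Unicode assigns each character a number, its code point.
code_points = [ord(ch) for ch in text]

# UTF-8 stores those code points as bytes; "é" needs two bytes, "A" needs one.
utf8_bytes = text.encode("utf-8")

# Decoding with the same encoding recovers the original characters.
round_trip = utf8_bytes.decode("utf-8")

print(code_points)
print(utf8_bytes)
print(round_trip == text)
```

Reading bytes with a different encoding than the one used to write them is a common source of garbled text, which is why the encoding is part of what makes text machine-readable.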

Noun
  1. electronic text that is stored as strings of characters and that can be displayed in a variety of formats